Learning Decision Trees from Histogram Data Using Multiple Subsets of Bins

Authors

  • Ram B. Gurung
  • Tony Lindgren
  • Henrik Boström
Abstract

The standard approach to learning decision trees from histogram data is to treat the bins as independent variables. However, as the underlying dependencies among the bins might not be completely exploited by this approach, an algorithm has been proposed for learning decision trees from histogram data by considering all bins simultaneously while partitioning examples at each node of the tree. Although the algorithm has been demonstrated to improve predictive performance, its computational complexity has turned out to be a major bottleneck, in particular for histograms with a large number of bins. In this paper, we propose instead a sliding window approach to select subsets of the bins to be considered simultaneously while partitioning examples. This significantly reduces the number of possible splits to consider, allowing for substantially larger histograms to be handled. We also propose to evaluate the original bins independently, in addition to evaluating the subsets of bins when performing splits. This ensures that the information obtained by treating bins simultaneously is an additional gain compared to what is considered by the standard approach. Results of experiments on applying the new algorithm to both synthetic and real-world datasets demonstrate positive results in terms of predictive performance without excessive computational cost.
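The bin-selection idea in the abstract can be illustrated with a minimal sketch. It assumes that a window covers contiguous bins and slides one bin at a time; the abstract does not state the window sizes or step used in the paper, so `window_size`, the step of one, and both function names are hypothetical choices made for illustration only.

```python
from typing import List, Tuple


def sliding_window_subsets(n_bins: int, window_size: int) -> List[Tuple[int, ...]]:
    """Return index tuples of contiguous bins covered by a sliding window.

    Assumption: the window moves one bin at a time over bins 0..n_bins-1.
    """
    if not 1 <= window_size <= n_bins:
        raise ValueError("window_size must be between 1 and n_bins")
    return [
        tuple(range(start, start + window_size))
        for start in range(n_bins - window_size + 1)
    ]


def candidate_bin_sets(n_bins: int, window_size: int) -> List[Tuple[int, ...]]:
    """Combine single bins (the standard approach) with windowed subsets."""
    # Each original bin is also evaluated on its own.
    singles = [(i,) for i in range(n_bins)]
    # Windows of size 1 would only duplicate the single bins.
    windows = sliding_window_subsets(n_bins, window_size) if window_size > 1 else []
    return singles + windows


if __name__ == "__main__":
    for subset in candidate_bin_sets(n_bins=5, window_size=3):
        print(subset)
```

Under these assumptions, a histogram with n bins and a window of size w yields n - w + 1 windowed subsets in addition to the n single bins, so the number of candidate bin sets grows linearly with the number of bins. How each subset is then turned into a split is not described in the abstract and is left out of the sketch.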

Similar resources

Learning Decision Trees From Histogram Data

When applying learning algorithms to histogram data, bins of such variables are normally treated as separate independent variables. However, this may lead to a loss of information as the underlying dependencies may not be fully exploited. In this paper, we adapt the standard decision tree learning algorithm to handle histogram data by proposing a novel method for partitioning examples using bin...

Steganalysis Method for LSB Replacement Based on Local Gradient of Image Histogram

In this paper we present a new accurate steganalysis method for the LSB replacement steganography. The suggested method is based on the changes that occur in the histogram of an image after the embedding of data. Every pair of neighboring bins of a histogram are either inter-related or unrelated depending on whether embedding of a bit of data in the image could affect both bins or not. We show that...

Medical Image Segmentation Based on Mutual Information Maximization

In this paper we propose a two-step mutual information-based algorithm for medical image segmentation. In the first step, the image is structured into homogeneous regions, by maximizing the mutual information gain of the channel going from the histogram bins to the regions of the partitioned image. In the second step, the intensity bins of the histogram are clustered by minimizing the mutual inf...

Evaluating association rules and decision trees to predict multiple target attributes

Association rules and decision trees represent two well-known data mining techniques to find predictive rules. In this work, we present a detailed comparison between constrained association rules and decision trees to predict multiple target attributes. We identify important differences between the two techniques for this goal. We conduct an extensive experimental evaluation on a real medical data...

Improving Classification Accuracy Using Ensemble Learning Technique (Using Different Decision Trees)

Using ensemble methods is one of the general strategies to improve the accuracy of classifiers and predictors. Bagging is one of the suitable ensemble learning methods. Ensemble learning is a simple, useful and effective meta-classification methodology that combines the predictions from multiple base classifiers (or learners). In this paper we show a comparative study of different classifiers (Dec...

Publication date: 2016